Robust singing detection in speech/music discriminator design
Authors
Abstract
In this paper, an approach for robust singing signal detection in speech/music discrimination is proposed and applied to audio indexing. Conventional speech/music discrimination approaches perform reasonably well on regular music signals but often perform poorly on singing segments. This is mainly because speech and singing signals are acoustically very similar, and the traditional features used in speech recognition do not provide a reliable cue for discriminating between them. To improve the robustness of speech/music discrimination, a new set of features derived from the harmonic coefficient and its 4 Hz modulation values is developed; these features provide additional, reliable cues for separating speech from singing. In addition, a rule-based post-filtering scheme is described that leads to further improvements in speech/music discrimination. Source-independent audio indexing experiments on the PBS Skills database indicate that the proposed approach greatly reduces the classification error rate on singing segments in the audio stream. Compared with existing approaches, the overall segmentation error rate is reduced by more than 30%, averaged over all shows in the database.
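As a rough illustration of what a "4 Hz modulation value" of a frame-level feature can look like, the sketch below measures how much of a feature trajectory's fluctuation energy sits at 4 Hz, roughly the syllabic rate of speech. The abstract does not define the harmonic coefficient or the exact modulation measure, so the function name `modulation_energy_ratio`, its parameters, and the single-frequency projection used here are assumptions for illustration, not the authors' method.

```python
import math

def modulation_energy_ratio(trajectory, frame_rate, mod_freq=4.0):
    """Share of a feature trajectory's fluctuation energy at mod_freq (Hz).

    trajectory: per-frame feature values (hypothetical stand-in for a
                per-frame harmonic coefficient).
    frame_rate: number of frames per second.
    """
    n = len(trajectory)
    if n == 0:
        return 0.0
    # Remove the mean so only the fluctuation (AC part) is measured.
    mean = sum(trajectory) / n
    centered = [x - mean for x in trajectory]
    total = sum(x * x for x in centered)
    if total == 0.0:
        return 0.0
    # Project the trajectory onto a cosine and a sine at mod_freq.
    step = 2.0 * math.pi * mod_freq / frame_rate
    re = sum(x * math.cos(step * i) for i, x in enumerate(centered))
    im = sum(x * math.sin(step * i) for i, x in enumerate(centered))
    # For a pure sinusoid at mod_freq spanning whole periods this is 1.
    return min(1.0, 2.0 * (re * re + im * im) / (n * total))
```

Under these assumptions, a trajectory that fluctuates at about 4 Hz scores near 1 and a slowly varying one scores near 0. Whether the proposed features behave this way on speech versus singing is a claim of the paper, not something this sketch demonstrates.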
Similar resources
A Music Retrieval System with a Seamless Query Interface by Humming or Song Title
We propose a music retrieval system that enables a user to retrieve a song by two different methods: by singing its melody or by saying its title. To allow the user to use those methods seamlessly without changing a voice input mode, a method of automatically discriminating between singing and speaking voices is indispensable. We therefore designed an automatic vocal style discriminator and bui...
New warped LPC-Based Feature for Fast and robust speech/Music Discrimination
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents a low complexity but effective approach for speech/music discrimination, which exploits only one simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC). A three-component Gaussian Mixture Model (GMM) classifier is used because it showed a slightly better performance...
Singing Voice Separation from Monaural Recordings
Separating singing voice from music accompaniment has wide applications in areas such as automatic lyrics recognition and alignment, singer identification, and music information retrieval. Compared to the extensive studies of speech separation, singing voice separation has been little explored. We propose a system to separate singing voice from music accompaniment from monaural recordings. The ...
Speech/Music Discrimination Using a Single Warped LPC-Based Feature
Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents a low complexity but effective approach for speech/music discrimination, which exploits only one simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC). A three-component Gaussian Mixture Model (GMM) classifier is used because it showed a slightly better performance...
Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children
Introduction: In recent years, music has been employed in many intervention and rehabilitation programs to enhance cognitive abilities in patients. Numerous studies show that music therapy can help improve language skills in patients, including those with hearing impairment. In this study, a new method of music training is introduced based on principles of neuroscience and capabilities of Persian languag...